37 research outputs found

    Sentiment Analysis from Indonesian Twitter Data Using Support Vector Machine And Query Expansion Ranking

    Sentiment analysis is the computational study of opinions and feelings expressed in text. Twitter has become a popular social network among Indonesians. For a public figure running for president of Indonesia, public opinion is an important indicator of a candidate's popularity, and media has become one of the important tools used to increase electability. However, analyzing sentiment in tweets is not easy, because they contain unstructured text, especially in Indonesian. The purpose of this research is to classify Indonesian Twitter data into positive and negative sentiment polarity using Support Vector Machine and Query Expansion Ranking, so that the information the data contains can be extracted and made useful to those who need it. The research stages include data crawling, data preprocessing, Term Frequency – Inverse Document Frequency (TF-IDF) weighting, feature selection with Query Expansion Ranking, and classification using the Support Vector Machine (SVM) method. Classification performance is evaluated with a confusion matrix. The results show that classification using the proposed method reached an accuracy of 77% and an F-measure of 68%.
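Accuracy and F-measure figures like those reported above are usually derived from the four cells of a binary confusion matrix. The sketch below shows the standard formulas only; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f_measure) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, precision, recall, f_measure


# Hypothetical counts for illustration only (not the paper's data).
acc, prec, rec, f1 = binary_metrics(tp=40, fp=10, fn=20, tn=30)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f_measure={f1:.3f}")
```

Because accuracy counts true negatives and the F-measure does not, the two scores can differ noticeably on the same predictions, as they do in the abstract.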

    Combination of Term Weighting with Class Distribution and Centroid-based Approach for Document Classification

    A text retrieval system requires a method that can return documents highly relevant to user requests. One of the important stages in text representation is term weighting. Term Frequency (TF) considers the number of occurrences of a word in each document, while Inverse Document Frequency (IDF) considers how widely the word is distributed across the document collection. However, TF-IDF weighting cannot represent the distribution of words across documents that belong to many classes or categories. The more unequal the distribution of a word across categories, the more important that word should be as a feature. This study developed a new term weighting method in which weighting is based on the frequency of occurrence of terms in each class, integrated with the distribution of centroid-based terms, which can minimize intra-cluster similarity and maximize inter-cluster variance. The ICF.TDCB term weighting method gave the best results when applied to SVM modeling on a dataset of 931 online news documents. The SVM model reached an accuracy of 0.723, outperforming other term weightings such as TF.IDF, ICF, and TDCB.
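For reference, the baseline TF-IDF weighting that the abstract compares against can be sketched as follows. This assumes one common variant (raw term count multiplied by the natural log of N/df); the paper may use a different variant, and the class-based ICF.TDCB weighting is not reproduced here because the abstract does not define it. The function name and toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # raw term counts within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Toy corpus for illustration only.
docs = [["price", "rises", "rises"], ["price", "falls"], ["goal", "wins"]]
for w in tf_idf(docs):
    print(w)
```

Note that the weights depend only on document counts, not on class labels, which is the limitation the abstract points to.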

    Peringkasan Berita Online Corona Virus dengan Metode Lexical Chain dan Word Sense Disambiguation

    Automatic news summarization extracts the core of a news article without losing its important meaning. Several methods are available for automatic news summarization, one of which is the Lexical Chain method. This method performs well in text summarization by determining the highest-scoring chain. However, it cannot identify ambiguous words in news sentences. To address this weakness, this study complements the Lexical Chain method with Word Sense Disambiguation to identify ambiguous words. The study used 100 news articles about Covid-19 sourced from the most popular online news portals. Summarization quality was tested using Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with three evaluation measures: precision, recall, and F-measure. The evaluation yielded an average precision of 0.62, recall of 0.20, and F-measure of 0.30.
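A minimal sketch of how ROUGE-style precision, recall, and F-measure can be computed from unigram overlap (ROUGE-1) is given below. The abstract does not say which ROUGE variant or tokenization was used, so whitespace tokenization, lowercasing, and the function name `rouge_1` are assumptions, and the example sentences are invented.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram-overlap precision, recall, and F-measure between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Invented example: a short system summary against a longer reference summary.
p, r, f = rouge_1("the virus spread quickly",
                  "the virus spread quickly across the city last week")
print(f"precision={p:.2f} recall={r:.2f} f_measure={f:.2f}")

# Harmonic mean of the averages reported in the abstract.
reported_f = 2 * 0.62 * 0.20 / (0.62 + 0.20)
print(f"harmonic mean of 0.62 and 0.20 = {reported_f:.2f}")
```

A short summary measured against a longer reference gives high precision and low recall, the same pattern as the reported averages. The harmonic mean of 0.62 and 0.20 comes to roughly 0.30, in line with the reported F-measure, although an average of per-document F-measures need not equal the F-measure of the averaged precision and recall.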

    Implementasi Website Profil Madrasah Muhammadiyah Al-Munawarroh Malang Sebagai Media Informasi Bagi Masyarakat

    Madrasah Muhammadiyah Al-Munawarroh is a madrasah located in Malang and one of the developing madrasahs. To support this development, a profile website was built to deliver information to the public quickly. The website was developed using the RAD (Rapid Application Development) method. This method was chosen because we wanted to involve the madrasah in building the website, so that the resulting site would be fully useful and meet the madrasah's information-dissemination needs. The resulting profile website shows that the madrasah can easily disseminate important information such as student registration schedules, information about the madrasah, and job vacancies at the madrasah.